Optimal learning and experimentation in bandit problems
Similar resources
Local Bandit Approximation for Optimal Learning Problems
In general, procedures for determining Bayes-optimal adaptive controls for Markov decision processes (MDPs) require a prohibitive amount of computation: the optimal learning problem is intractable. This paper proposes an approximate approach in which bandit processes are used to model, in a certain "local" sense, a given MDP. Bandit processes constitute an important subclass of MDPs, and have ...
Optimal Adaptive Learning in Uncontrolled Restless Bandit Problems
In this paper we consider the problem of learning the optimal policy for uncontrolled restless bandit problems. In an uncontrolled restless bandit problem, there is a finite set of arms, each of which, when pulled, yields a positive reward. A player sequentially selects one of the arms at each time step. The goal of the player is to maximize its undiscounted reward over a time horizo...
Bandit Problems and Online Learning
In this section, we consider problems related to the topic of online learning. In particular, we are interested in problems where data is made available sequentially, and decisions must be made or actions taken based on the data currently available. This is to be contrasted with many problems in optimization and model fitting, where the data under consideration is available at the start. Furthe...
Q-Learning for Bandit Problems
Multi-armed bandits may be viewed as decompositionally structured Markov decision processes (MDPs) with potentially very large state sets. A particularly elegant methodology for computing optimal policies was developed over twenty years ago by Gittins [Gittins & Jones, 1974]. Gittins' approach reduces the problem of finding optimal policies for the original MDP to a sequence of low-dimensional stopping...
Social Learning in One-arm Bandit Problems
The copyright to this Article is held by the Econometric Society. It may be downloaded, printed and reproduced only for educational or research purposes, including use in course packs. No downloading or copying may be done for any commercial purpose without the explicit permission of the Econometric Society. For such commercial purposes contact the Office of the Econometric Society (contact inf...
Journal
Journal title: Journal of Economic Dynamics and Control
Year: 2002
ISSN: 0165-1889
DOI: 10.1016/s0165-1889(01)00028-8